Skip to content

Conversation

Fuzuki785
Copy link
Contributor

This is:

  • a bugfix
  • a new feature
  • refactoring
  • additional unit tests

Checklist:

  • Changes are covered by unit tests
    • Changes are covered by existing unit tests
    • New unit tests have been added
  • Code style is respected
  • Commit message explains why the change is made (see https://github.com/erlang/otp/wiki/Writing-good-commit-messages)
  • CHANGELOG.md contains a short summary of the change and a link to the pull request if applicable
  • Documentation is updated as necessary

Why this change is needed?

Related issue #4607

Improved the speed of the rangeToArrayYield method in Worksheet by:

  • using a binary search algorithm first to approach the right index, before using linear search backwards and forwards to realign properly (limits the linear search iterations to about a row's worth of indices rather than the whole file's worth)
  • improving the speed of the getCoordinates, getSortedCoordinates, getSortedCoordinatesInt method in Collections/Cells by adding a flag to avoid sorting the index array on every call and cache arrays to avoid using array_keys and array_values on every call to the methods

@oleibman
Copy link
Collaborator

@Fuzuki785 Thank you for the PR. It is difficult for me to evaluate its effectiveness. If I were to create a private repository with access allowed only to you and me, would you be willing to upload your file and code so that I can observe the performance improvements first-hand?

@Fuzuki785
Copy link
Contributor Author

@oleibman I created a repository with a minimalist configuration for the issue, using sample data with only 95000 cells (instead of 1.8M cells): Fuzuki785/PhpSpreadsheet-4607-reprod.

I've copied the execution logs (time to read rows, file loading excluded) here from the README file for reference:

# Base version (5.0)
❯ php Reprod.php
Progress: 999
Time delta: 8.6829102039337 s
Progress: 1999
Time delta: 8.9077999591827 s
Progress: 2999
Time delta: 9.0496628284454 s
Progress: 3999
Time delta: 9.2766671180725 s
Progress: 4999
Time delta: 9.5125539302826 s
Total time: 45.43901014328 s
# Patched version
❯ php Reprod.php
Progress: 999
Time delta: 0.10127186775208 s
Progress: 1999
Time delta: 0.045258045196533 s
Progress: 2999
Time delta: 0.065037965774536 s
Progress: 3999
Time delta: 0.04623007774353 s
Progress: 4999
Time delta: 0.04681396484375 s
Total time: 0.30479502677917 s

@oleibman
Copy link
Collaborator

oleibman commented Sep 5, 2025

Thank you for the sample file and test program. I have downloaded them, and confirm your test results. I need to do my due diligence to figure out what, if anything, else is needed to move this change forward. But, so far, it looks really promising!

@oleibman oleibman added this pull request to the merge queue Sep 7, 2025
@oleibman
Copy link
Collaborator

oleibman commented Sep 7, 2025

Thank you for your well-thought-out and well-documented improvement.

Merged via the queue into PHPOffice:master with commit 66ea405 Sep 7, 2025
14 of 15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants